Data science is the discipline of making data useful. Ok…so what is it?
Engineering (infrastructure and production): the process of making everything else possible
Analysis: the process of quickly turning raw information into insights
Modeling/Inference: the process of diving deeper into the data to discover patterns we cannot easily see
(This is group work by the authors listed at https://github.com/brohrer/academic_advisory/blob/master/authors.md !)
Data environment: data storage, the Kafka platform, Hadoop and Spark clusters, etc.
Data management: parsing logs, web scraping, API queries, and interrogating data streams.
Production: integrating models and analyses into the production system
Domain knowledge
Exploratory analysis
Storytelling
Statistical inference
Supervised learning
Unsupervised learning
Customized model development
Excerpt from How Airbnb Democratizes Data Science With Data University:
Team matters!
> Happy teams are all alike; every unhappy team is unhappy in its own way
If you want to be Google employee #20, you need to join Google when it had only 19 employees